Abstract

We study recovery conditions of weighted $\ell_1$ minimization for signal reconstruction from compressed
sensing measurements when partial support information is available. We show that if at least 50% of the
(partial) support information is accurate, then weighted $\ell_1$ minimization is stable and robust under weaker
sufficient conditions than the analogous conditions for standard $\ell_1$ minimization. Moreover, weighted $\ell_1$
minimization provides better upper bounds on the reconstruction error in terms of the measurement noise
and the compressibility of the signal to be recovered. We illustrate our results with extensive numerical
experiments on synthetic data and real audio and video signals.

Index Terms

Compressed sensing, weighted $\ell_1$ minimization, adaptive recovery.

I. Introduction

Compressed sensing (see, e.g., [1]-[3]) is a paradigm for effective acquisition of signals that admit
sparse (or approximately sparse) representations in some transform domain. The approach can be used to
reliably recover such signals from significantly fewer linear measurements than their ambient dimension.
Because a wide range of natural and man-made signals (e.g., audio, natural and seismic images, video,
and wideband radio frequency signals) are sparse or approximately sparse in appropriate transform
domains, the potential applications of compressed sensing can be immense.

Let $\Sigma_k^N := \{x \in \mathbb{R}^N : \|x\|_0 \le k\}$ be the set of all $k$-sparse signals in $\mathbb{R}^N$, and let

$$ y := Ax + e \tag{1} $$

be a vector of measurements where $A$ is a known $n \times N$ measurement matrix, and $e$ denotes additive
noise that satisfies $\|e\|_2 \le \epsilon$ for some known $\epsilon \ge 0$. Compressed sensing theory states that it is possible to
recover $x \in \Sigma_k^N$ from $y$ (given $A$) even when $n \ll N$, i.e., using very few measurements. For example,
when $\epsilon = 0$, one may recover an estimate $x^*$ of the signal $x$ as the solution of the constrained $\ell_0$
minimization problem

$$ \underset{z \in \mathbb{R}^N}{\text{minimize}} \ \|z\|_0 \quad \text{subject to} \quad Az = y. \tag{2} $$

In fact, using (2), any $x \in \Sigma_k^N$ can be recovered perfectly using $n$ measurements when $n > 2k$ and $A$
is in general position (see, e.g., [4]). However, $\ell_0$ minimization is a combinatorial problem and quickly
becomes intractable as the dimensions increase. Instead, the convex relaxation

$$ \underset{z \in \mathbb{R}^N}{\text{minimize}} \ \|z\|_1 \quad \text{subject to} \quad \|Az - y\|_2 \le \epsilon \tag{3} $$

can be used to recover the estimate $x^*$. Candès, Romberg, and Tao [2] and Donoho [1] show that if
$n \gtrsim k\log(N/k)$, then $\ell_1$ minimization (3) can stably and robustly recover $x$ from "incomplete" and
inaccurate measurements $y = Ax + e$, where $A$ is an appropriately chosen $n \times N$ measurement matrix
and $\|e\|_2 \le \epsilon$. Note that compressed sensing is a non-adaptive data acquisition technique because the
measurement matrix $A$ does not depend on $x$, the signal being measured. Furthermore, the recovery
method that we just described is itself non-adaptive because no information on $x$ is used in (3). Our goal
in this paper is to examine a recovery method that is adaptive in the sense that it exploits prior support
information on $x$; the measurement process, however, remains non-adaptive.

A. Compressed sensing with prior support information 

The $\ell_1$ minimization problem (3) does not incorporate any prior information about the support of $x$.
However, in many applications it may be possible to draw an estimate of the support of the signal or
an estimate of its largest coefficients. For example, signals such as video and audio exhibit correlation
over temporal frames that can be exploited to estimate a portion of the support using previously decoded
frames.

Consider the example where $x \in \mathbb{R}^N$ is a compressible signal, i.e., it can be well-approximated by
its $k$ largest-in-magnitude entries, where $k \ll N$. If $x$ represents the discrete cosine transform (DCT) or
wavelet coefficients of an image, then the entries of $x$ that correspond to the low frequency subbands are
most likely to be non-zero and carry most of the energy of the signal [5]. In such cases, it is beneficial
to incorporate this information in the recovery algorithm when $x$ is compressively sampled.

B. Previous Work 

We are especially interested in methods that incorporate prior support information by replacing the $\ell_1$
minimization in (3) with weighted $\ell_1$ minimization

$$ \underset{z}{\text{minimize}} \ \|z\|_{1,w} \quad \text{subject to} \quad \|Az - y\|_2 \le \epsilon, \tag{4} $$

where $w \in [0, 1]^N$ and $\|z\|_{1,w} := \sum_i w_i |z_i|$ is the weighted $\ell_1$ norm. In particular, in the methods that we
describe here (including our own proposed method), the main idea is to choose $w$ such that the entries
of $x$ that are "expected" to be large are penalized less in this weighted objective function.
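The weighted norm in (4) can be sketched in a few lines. The snippet below is only an illustration: the function names and the zero-based index convention are assumptions of this sketch, and the two-level weights (one value on a support estimate, 1 elsewhere) are just one possible choice of $w$.

```python
def weighted_l1_norm(z, w):
    """Return sum_i w_i * |z_i|, the weighted l1 norm appearing in (4)."""
    if len(z) != len(w):
        raise ValueError("z and w must have the same length")
    if any(not 0.0 <= wi <= 1.0 for wi in w):
        raise ValueError("weights must lie in [0, 1]")
    return sum(wi * abs(zi) for zi, wi in zip(z, w))


def support_weights(n_entries, support_estimate, omega):
    """Weights equal to omega on the support estimate and 1 elsewhere.

    Indices are zero-based here (an assumption of this sketch).
    """
    est = set(support_estimate)
    return [omega if i in est else 1.0 for i in range(n_entries)]


z = [3.0, -1.0, 0.0, 0.5]
w = support_weights(len(z), {0}, omega=0.0)  # entry 0 is "expected" to be large
print(weighted_l1_norm(z, w))  # 1.5: the large entry is not penalized
```

With all weights equal to 1 the function returns the ordinary $\ell_1$ norm, which is how (4) reduces to (3).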

The recovery of compressively sampled signals using prior support information has been previously
studied in the literature; see, e.g., [6]-[11]. In fact, the problem of sparse recovery with partially known
support was independently introduced in three works: in von Borries et al. [6], in Vaswani and Lu [8],
and in Khajehnejad et al. [11].

The work by von Borries et al. [6] demonstrated empirically that incorporating support information of
a signal with a sparse discrete Fourier transform (DFT) allows for the number of compressed sensing
measurements to be reduced by exactly the size of the known part of the support. Von Borries et al. achieve
this by using a weighted $\ell_1$ minimization approach with zero weights on the known support.

More recently, Vaswani and Lu [7]-[9] proposed a modified compressed sensing approach that again
incorporates known support elements using a weighted $\ell_1$ minimization approach with zero weights on
the known support. Their work derives sufficient recovery conditions for the noise-free case (i.e., set
$e = 0$ in (1) and $\epsilon = 0$ in (4)) that are weaker than the analogous $\ell_1$ minimization conditions of [2] in
the case where a large proportion of the support is known. This work is supplemented by a regularized
modified compressed sensing approach that deals with noisy measurements [9]. The work of Vaswani and
Lu was also extended by Jacques in [10] to the cases of compressible signals and noisy measurements.
The approach of Jacques is based on studying the innovative basis pursuit denoising (iBPDN) problem,
which minimizes the weighted $\ell_1$ norm of the solution with zero weights applied to the support estimate;
Jacques shows that (iBPDN) has a similar stability behavior to the unweighted $\ell_1$ problem.

A similar method is proposed by Khajehnejad et al. [11] for the recovery of compressively sampled
signals with support information. The performance of this method is analyzed using a Grassmann angle
approach. Prior information is defined in terms of two disjoint sets that partition $\{1, \ldots, N\}$. The elements
in the first set have a probability $P_1$ of being nonzero, and the elements in the second set have a probability
$P_2$ of being nonzero, where $P_1 \ne P_2$. The authors propose weighted $\ell_1$ minimization to recover the
unknown vector where different weights $w_1$ and $w_2$ are assigned to the elements in the two sets. In
particular, they find the class of signals $x$, depending on $P_j$ and $w_j$, $j = 1, 2$, which can be recovered
with high probability using weighted $\ell_1$.

Finally, the weighted $\ell_1$ minimization problem is related to the "adaptive lasso" described in the statistics
literature and studied by Zou in [12]; it is defined by

$$ \underset{z}{\text{minimize}} \ \|y - Az\|_2^2 + \lambda_n \sum_{j=1}^{N} w_j |z_j|, $$

where $\lambda_n$ varies with the sample size $n$ such that $\lambda_n/\sqrt{n} \to 0$ and $\lambda_n n^{(\gamma-1)/2} \to \infty$ for some $\gamma > 0$,
and the weights $w_j = 1/|\hat{x}_j|^{\gamma}$, where $\hat{x}$ is a given signal estimate that is root-n consistent*. However, the
problem studied by Zou addresses the overdetermined scenario where the ambient dimension $N$ of the
signal is fixed and the number of measurements $n \to \infty$. In this case, Zou shows that the adaptive lasso
enjoys the oracle properties but acknowledges that when $N > n \to \infty$ it is nontrivial to find a consistent
estimate for constructing the weights in the adaptive lasso.

C. Contributions 

In this paper we adopt the weighted $\ell_1$ minimization approach described by (4). Given a support
estimate $\widetilde{T} \subset \{1, 2, \ldots, N\}$ for $x$, we set $w_j = \omega \in [0, 1]$ whenever $j \in \widetilde{T}$, and $w_j = 1$ otherwise.
Unlike von Borries et al. or Vaswani and Lu, in our results we allow $\omega$ to be non-zero. We derive stability and
robustness guarantees for weighted $\ell_1$ minimization that generalize the results of [2]. Our results take
into consideration the accuracy of the support estimate. In particular, we prove that if the (partial) support
estimate is at least 50% accurate, then weighted $\ell_1$ minimization outperforms standard $\ell_1$ minimization
in terms of accuracy, stability, and robustness. Finally, we note that when $\omega = 0$, our results hold under
weaker sufficient conditions than those in [7].

In Section II, we review the $\ell_1$ recovery guarantees of [2]. In Section III, we state our main result and
compare our theoretical results with standard $\ell_1$ recovery as well as the results of [7], [8]. In Sections IV
and V, we present the outcome of numerical experiments on synthetic and on audio and video signals.
We conclude with the proof of our main theorem in Section VI.

*Root-n consistency means that if $x^*$ is the solution to the adaptive lasso problem, then $\sqrt{n}(x^* - x) \to \mathcal{N}(0, \sigma^2)$ in
distribution, where $\sigma$ depends on the noise variance and the covariance of the measurement matrix $A$.

II. Compressed Sensing Overview 

Consider an arbitrary signal $x \in \mathbb{R}^N$ and let $x_k \in \Sigma_k^N$ be its best $k$-term approximation. Let $T_0 =
\mathrm{supp}(x_k)$, where $T_0 \subseteq \{1, \ldots, N\}$ and $|T_0| \le k$. We wish to reconstruct the signal $x$ from $y = Ax + e$,
where $A$ is a known $n \times N$ measurement matrix with $n \ll N$, and $e$ denotes the (unknown) measurement
error that satisfies $\|e\|_2 \le \epsilon$ for some known margin $\epsilon \ge 0$.

As we mentioned in the introduction, it was shown in [2] that $x$ can be stably and robustly recovered
from the measurements $y$ by solving the optimization problem (3) if the measurement matrix $A$ has the
restricted isometry property (RIP), also defined by [2].

Definition 1. The restricted isometry constant $\delta_k$ of a matrix $A$ is the smallest number such that for all
$k$-sparse vectors $u \in \Sigma_k^N$,

$$ (1 - \delta_k)\|u\|_2^2 \le \|Au\|_2^2 \le (1 + \delta_k)\|u\|_2^2. \tag{5} $$

Candès et al. [2] use the RIP to provide conditions and bounds for stable and robust recovery of $x$ by
solving (3).
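Computing $\delta_k$ exactly requires examining every support of size $k$, but (5) suggests a simple empirical probe: any $k$-sparse $u$ yields a lower bound on $\delta_k$ through the ratio $\|Au\|_2^2 / \|u\|_2^2$. The sketch below is a Monte Carlo illustration under assumed sizes and names (`rip_lower_bound`, `trials`, `seed` are hypothetical); it never certifies the RIP, it can only show that $\delta_k$ is at least as large as the returned value.

```python
import random


def rip_lower_bound(A, k, trials=200, seed=0):
    """Monte Carlo lower bound on the restricted isometry constant delta_k of A.

    A is a list of rows. Each trial draws a random k-sparse vector u and records
    | ||Au||_2^2 / ||u||_2^2 - 1 |; the largest value seen is returned.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    worst = 0.0
    for _ in range(trials):
        support = rng.sample(range(n_cols), k)           # random support of size k
        coeffs = [rng.gauss(0.0, 1.0) for _ in support]  # random nonzero entries
        norm_u_sq = sum(c * c for c in coeffs)
        Au = [sum(A[i][j] * c for j, c in zip(support, coeffs))
              for i in range(n_rows)]
        norm_Au_sq = sum(v * v for v in Au)
        worst = max(worst, abs(norm_Au_sq / norm_u_sq - 1.0))
    return worst


# Gaussian matrix with i.i.d. N(0, 1/n) entries (sizes chosen only for illustration).
n, N, k = 40, 100, 5
rng = random.Random(1)
A = [[rng.gauss(0.0, n ** -0.5) for _ in range(N)] for _ in range(n)]
print(rip_lower_bound(A, k))
```

Random sampling rarely finds the worst-case vector, so the true $\delta_k$ may be considerably larger than what this probe reports.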

Theorem 2 (Candès, Romberg, Tao [2]). Suppose that $x$ is an arbitrary vector in $\mathbb{R}^N$, and let $x_k$ be the
best $k$-term approximation of $x$. Suppose that there exists an $a \in \frac{1}{k}\mathbb{Z}$ with $a > 1$ and

$$ \delta_{ak} + a\,\delta_{(1+a)k} < a - 1. \tag{6} $$

Then the solution $x^*$ to (3) obeys

$$ \|x^* - x\|_2 \le C_0\,\epsilon + C_1 k^{-1/2}\|x - x_k\|_1. \tag{7} $$

Remark 2.1. The constants in Theorem 2 are explicitly given by

From Theorem 2, one can see that if $A$ satisfies (the slightly stronger condition)

$$ \delta_{(a+1)k} < \frac{a-1}{a+1}, \tag{9} $$

then the constrained $\ell_1$ minimization problem in (3) recovers $x$ with an approximation error that scales
well with measurement noise and the "compressibility" of $x$. Moreover, if $x$ is sufficiently sparse (i.e.,
$x = x_k$), and the measurement process is noise-free, then Theorem 2 guarantees exact recovery of $x$
from $y$.
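The step from (6) to (9) uses only the fact that restricted isometry constants are non-decreasing in the sparsity level, so that $\delta_{ak} \le \delta_{(a+1)k}$. Written out as a short worked derivation:

```latex
\begin{align*}
\delta_{ak} \le \delta_{(a+1)k}
  \;&\Longrightarrow\;
  \delta_{ak} + a\,\delta_{(a+1)k} \le (1+a)\,\delta_{(a+1)k}, \\
\delta_{(a+1)k} < \frac{a-1}{a+1}
  \;&\Longrightarrow\;
  (1+a)\,\delta_{(a+1)k} < a-1
  \;\Longrightarrow\;
  \delta_{ak} + a\,\delta_{(a+1)k} < a-1 .
\end{align*}
```

Hence (9) implies (6), which is why it is described as the slightly stronger condition.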

III. Compressed sensing with partial support estimation 

In this section, we present our main result showing that weighted $\ell_1$ minimization can be used to stably
and robustly recover sparse and compressible signals from noisy measurements when there is partial (and
possibly partly inaccurate) prior support information. Our result holds under weaker sufficient conditions
than its counterpart for $\ell_1$ minimization when the support estimate is more than 50% accurate. Moreover,
it results in smaller error bounds. We also compare our results with the modified compressed sensing
approach proposed in [7].

A. Weighted $\ell_1$ minimization with estimated support

Let $T_0$ be the support of $x_k$, and let $\widetilde{T}$, the support estimate, be a subset of $\{1, 2, \ldots, N\}$ with cardinality
$|\widetilde{T}| = \rho k$, where $0 \le \rho \le a$ for some $a > 1$. As before, we wish to recover an arbitrary vector
$x \in \mathbb{R}^N$ from noisy compressive measurements $y = Ax + e$, where $e$ satisfies $\|e\|_2 \le \epsilon$. To recover
$x \in \mathbb{R}^N$, we now consider the weighted $\ell_1$ minimization problem with the following choice of weights:

$$ \underset{z}{\text{minimize}} \ \|z\|_{1,w} \quad \text{subject to} \quad \|Az - y\|_2 \le \epsilon \quad \text{with} \quad w_i = \begin{cases} 1, & i \in \widetilde{T}^c, \\ \omega, & i \in \widetilde{T}. \end{cases} \tag{10} $$

Here, $0 \le \omega \le 1$ and $\|z\|_{1,w}$ is as defined in (4). Our main result follows.

Theorem 3. Let $x$ be in $\mathbb{R}^N$ and let $x_k$ be its best $k$-term approximation, supported on $T_0$. Let $\widetilde{T} \subset
\{1, \ldots, N\}$ be an arbitrary set and define $\rho$ and $\alpha$ as before such that $|\widetilde{T}| = \rho k$ and $|\widetilde{T} \cap T_0| = \alpha\rho k$.
Suppose that there exists an $a \in \frac{1}{k}\mathbb{Z}$, with $a \ge (1 - \alpha)\rho$, $a > 1$, and the measurement matrix $A$ has RIP
with

$$ \delta_{ak} + \frac{a}{\left(\omega + (1-\omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2}\,\delta_{(a+1)k} < \frac{a}{\left(\omega + (1-\omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2} - 1, \tag{11} $$

for some given $0 \le \omega \le 1$. Then the solution $x^*$ to (10) obeys

$$ \|x^* - x\|_2 \le C_0'\,\epsilon + C_1' k^{-1/2}\left(\omega\|x - x_k\|_1 + (1-\omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right), \tag{12} $$

where $C_0'$ and $C_1'$ are well-behaved constants that depend on the measurement matrix $A$, the weight $\omega$,
and the parameters $\alpha$ and $\rho$.

The proof of the theorem is presented in Section VI.

Remark 3.1. Note that the parameters in Theorem 3 specify two important ratios: $\rho$ determines the ratio
of the size of the estimated support to the size of the actual support of $x_k$ (or the support of $x$ if $x$
is $k$-sparse). On the other hand, $\alpha$ determines the ratio of the number of indices in $\mathrm{supp}(x_k)$ that were
accurately estimated in $\widetilde{T}$ to the size of $\widetilde{T}$. Specifically, $\alpha = \frac{|\widetilde{T} \cap T_0|}{|\widetilde{T}|}$.
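As a small illustration of the two ratios in Remark 3.1, the sketch below computes $\rho$ and $\alpha$ from a support estimate and the support of $x_k$. It assumes $|T_0| = k$ (the remark allows $|T_0| \le k$ in general), and the function and variable names are hypothetical.

```python
def support_ratios(estimate, true_support):
    """Return (rho, alpha) for a support estimate and the support T0 of x_k.

    rho   = |estimate| / |T0|               (relative size, assuming |T0| = k)
    alpha = |estimate & T0| / |estimate|    (fraction of the estimate that is correct)
    """
    est, true = set(estimate), set(true_support)
    if not est or not true:
        raise ValueError("both sets must be nonempty")
    rho = len(est) / len(true)
    alpha = len(est & true) / len(est)
    return rho, alpha


T0 = {2, 5, 7, 11}     # support of the best 4-term approximation
T_est = {2, 5, 8, 9}   # estimate of the same size, half of it correct
print(support_ratios(T_est, T0))  # (1.0, 0.5)
```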
Remark 3.2. The constants $C_0'$ and $C_1'$ are explicitly given by the expressions

(13)

Consequently, Theorem 3, with $\omega = 1$, reduces to the stable and robust recovery theorem of [2], which
we stated above (see Theorem 2).

Remark 3.3. It is sufficient that $A$ satisfies

$$ \delta_{(a+1)k} < \hat{\delta}^{(\omega)} := \frac{a - \left(\omega + (1-\omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2}{a + \left(\omega + (1-\omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2} \tag{14} $$

for Theorem 3 to hold, i.e., to guarantee stable and robust recovery of the signal $x$ from measurements
$y = Ax + e$ (with constants $C_0'$ and $C_1'$ given in (13) and (14)).
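The threshold in (14) is easy to evaluate numerically. The following sketch (function names `gamma` and `delta_hat` are hypothetical) evaluates it and checks two limiting cases: $\omega = 1$ and $\alpha = 0.5$ both give $(a-1)/(a+1)$, the threshold for standard $\ell_1$ minimization.

```python
import math


def gamma(omega, alpha, rho):
    """The factor omega + (1 - omega) * sqrt(1 + rho - 2*alpha*rho) from (11) and (14)."""
    return omega + (1.0 - omega) * math.sqrt(1.0 + rho - 2.0 * alpha * rho)


def delta_hat(a, omega, alpha, rho):
    """Right-hand side of (14): the bound on delta_{(a+1)k} sufficient for Theorem 3."""
    g2 = gamma(omega, alpha, rho) ** 2
    return (a - g2) / (a + g2)


# With omega = 1 the weights are all one and the threshold is (a - 1)/(a + 1).
print(delta_hat(a=3, omega=1.0, alpha=0.7, rho=1.0))  # 0.5

# For alpha > 0.5 and omega < 1 the threshold is larger (a weaker condition);
# for alpha < 0.5 it is smaller.
print(delta_hat(a=3, omega=0.0, alpha=0.7, rho=1.0) > 0.5)  # True
print(delta_hat(a=3, omega=0.0, alpha=0.3, rho=1.0) < 0.5)  # True
```

Since the square root equals 1 when $\alpha = 0.5$ and $\rho = 1$, the factor `gamma` is then 1 for every $\omega$, which is the reason the weights have no effect on the condition in that case.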

Remark 3.4. Theorems 2 and 3 guarantee stable and robust recovery for matrices $A$ satisfying a condition
on $\delta_{(a+1)k}$ with $a > 1$. A slightly different approach was used by Candès [13] to handle the case $a = 1$.
Candès proved that if $\delta_{2k} < (\sqrt{2} + 1)^{-1}$, then $\ell_1$ minimization (3) achieves stable and robust recovery.
Following the same technique, with appropriate modifications to handle the weighted $\ell_1$ objective, we
can derive the analogous alternative sufficient condition

$$ \delta_{2k} < \left(\sqrt{2}\left(\omega + (1-\omega)\sqrt{1 + \rho - 2\alpha\rho}\right) + 1\right)^{-1}, \tag{15} $$

which guarantees stable and robust recovery using weighted $\ell_1$ minimization (10). We omit the details
of this calculation.

B. Comparison to standard $\ell_1$ recovery

In this section, we compare the sufficient conditions for Theorem 3 and Theorem 2 as well as the
associated constants of these two theorems. The following observation is easy to verify.

Proposition 4. Let $C_0$, $C_1$, $C_0'$, and $C_1'$ be as above. Then

(i) If $\omega = 1$, then $C_0' = C_0$, $C_1' = C_1$, and the sufficient conditions for Theorem 3, given in (11), are
identical to those of Theorem 2, given in (6).

(ii) If $\alpha = 0.5$, then, again, $C_0' = C_0$, $C_1' = C_1$, and the sufficient conditions for Theorem 3, given in
(11), are identical to those of Theorem 2, given in (6).

(iii) Suppose $0 \le \omega < 1$. Then $C_0' < C_0$ and $C_1' < C_1$ if and only if $\alpha > 0.5$.

Fig. 1: Comparison of the sufficient conditions for recovery and stability constants for weighted $\ell_1$
reconstruction with various values of $\alpha$. In all the figures, we set $a = 3$ and $\rho = 1$. (a) $\hat{\delta}^{(\omega)}$ vs. $\omega$, (b) $C_0'$ vs.
$\omega$, (c) $C_1'$ vs. $\omega$. In (b) and (c) we fix $\delta_{(a+1)k} = 0.1$.

Next, we illustrate how the slightly stronger sufficient conditions given in (14) and the respective
stability constants vary with $\alpha$ and $\omega$. Recall that when $\omega = 1$, (14) reduces to (9). In Figure 1 (a), we
plot, for different values of $\alpha$, $\hat{\delta}^{(\omega)}$ as defined in (14), versus $\omega$, where we set the parameter $a = 3$. We
observe that as $\alpha$ increases the sufficient condition on the RIP constant becomes weaker, allowing for a
wider class of measurement matrices $A$. For example, with $a = 3$, when 70% of the support estimate is
accurate, with $\omega = 0.2$ it suffices to have $\delta_{4k} < 0.763$, compared with $\delta_{4k} < 0.5$ for $\ell_1$ minimization.
Figures 1 (b) and (c) illustrate that for a fixed matrix $A$, the constants $C_0'$ and $C_1'$ decrease as $\alpha$ increases.
Note that compared to setting $\omega = 0$, assigning non-zero weights $\omega$ adds robustness to the weighted $\ell_1$
problem in the case when $\alpha < 0.5$, i.e., when we have an inaccurate support estimate $\widetilde{T}$ with more
than half the entries falling outside the support of the best $k$-term approximation of $x$. This could be
beneficial in applications where the accuracy of the support estimates varies significantly from one signal to
the next. Furthermore, in numerical experiments (see Section IV) we observe that using non-zero weights
improves the quality of the reconstruction, especially in the noisy and compressible settings, not only
when $\alpha < 0.5$ but also in some cases where $\alpha > 0.5$. A mathematical understanding of this behavior
and of how to optimally choose the weight $\omega$ is beyond the scope of this paper.

C. The zero weight case: $\omega = 0$

One special case of the weighted $\ell_1$ problem that is of interest is the zero weight case, i.e., set $\omega = 0$
in (10). It can be seen from Figure 1 that recovery using weighted $\ell_1$ minimization (10) achieves the
smallest error bound constants at $\omega = 0$ when $\alpha > 0.5$. On the other hand, the recovery performance is
worst when $\omega = 0$ and $\alpha < 0.5$, i.e., when the support estimate is highly inaccurate.

Several contributions in the literature adopt the zero-weight approach, mainly in applications where
prior support information is assumed to be highly accurate, i.e., $\alpha$ is close to 1; see, e.g., [6], [7], [14].
The most recent study to address this problem is the work by Vaswani and Lu [7] where a sufficient
condition in terms of the RIP of the matrix $A$ is derived for exact recovery in the noise-free case. Another
work by the same authors [14] addresses the noisy case; however, the recovery algorithm in this case
is different from (10) in that the objective function is modified to include a regularization term. The
sufficient condition derived in Corollary 1 of [7] is expressed as

$$ 2\delta_{2u} + \delta_{3u} + \delta_k + \delta_{k+u}^2 + 2\delta_{k+2u}^2 < 1, \tag{16} $$

where $u = (1 - \alpha\rho)k$ is the size of the unknown support. Recall that $\alpha$ is such that $\alpha\rho k$ is the size of the
known support.
Fig. 2: Comparison between the values of $u/k$ that satisfy each of the sufficient conditions (16) and (17).
The measurement matrix has Gaussian entries with $n/N = 0.5$.

Below we compare our condition (15) with that of [7] given in (16) for different values of the unknown
support size $u$. We consider the case $\rho = 1$, $\omega = 0$, and $n/N = 0.5$. Thus, (15) reduces to

$$ \delta_{2k} < \frac{1}{2\sqrt{u/k} + 1}. \tag{17} $$
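The reduction from (15) to (17) is a direct substitution. With $\rho = 1$ the unknown support size is $u = (1 - \alpha)k$, so $1 - \alpha = u/k$, and setting $\omega = 0$ gives:

```latex
\begin{align*}
\Big(\omega + (1-\omega)\sqrt{1+\rho-2\alpha\rho}\Big)\Big|_{\omega=0,\ \rho=1}
  &= \sqrt{2(1-\alpha)} = \sqrt{2u/k}, \\
\delta_{2k} < \Big(\sqrt{2}\,\sqrt{2u/k} + 1\Big)^{-1}
  &= \frac{1}{2\sqrt{u/k} + 1}.
\end{align*}
```

In particular, when the support is known exactly ($u = 0$) the bound becomes $\delta_{2k} < 1$.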

Since the two sufficient conditions, i.e., (17) and (16), are expressed in terms of RIP constants of different-
sized submatrices of $A$, a simple comparison of the upper bounds is not informative. For this reason,
we restrict our attention to measurement matrices drawn from the Gaussian ensemble and we estimate
the associated RIP constants (i.e., $\delta_{2k}$, $\delta_{2u}$, $\ldots$) for such matrices using the bounds derived in [15]. In
particular, we calculate the ratios $u/k$ that satisfy the conditions (16) and (17), respectively, and plot
the results in Figure 2. Observe that for the same measurement matrix $A$ and sparsity level $k/n$, our
sufficient condition guarantees the recovery of $k$-sparse signals with significantly less accurate prior
support information compared to the condition of Vaswani and Lu [7].

It is clear in Figure 2 that our recovery guarantees are superior to those of [7], at least when the aspect
ratio of the measurement matrix is $n/N = 0.5$. Next, we shall focus on cases where we have a highly
accurate estimate of the full support of the $k$-sparse vector $x$. In other words, we set $\rho = 1$ as above and
consider values of $\alpha$ that are close to 1. For these cases, we will compare our theoretical guarantees to
those of [7] for various values of the measurement matrix aspect ratio. To that end, we observe that the
left-hand side of (16) is increasing in $u$. Thus, for any $u$, (16) can hold only if

$$ \delta_k + 3\delta_k^2 < 1 \implies \delta_k < 0.4343, $$

which is obtained by setting $u = 0$ in (16) and observing that $\delta_0 = 0$ by definition. On the other hand,
using the bounds from [15], we can estimate $\delta_{2k}$ and find the corresponding range of $u$ for (17) to hold
in the case when $A$ is a Gaussian random matrix. The upper bound on the range of $u/k$ for various
aspect ratios of the measurement matrix is reported in Table I. We conclude that in various cases with
different measurement matrix aspect ratios our theoretical results guarantee recovery while the results of
[7] fail to provide any recovery guarantee.

TABLE I: Maximum unknown support size $u/k$ for which (17) holds while (16) fails to hold. For a
given aspect ratio $n/N$, we compute the value of $k/n$ for which $\delta_k = 0.4343$. This value, using (17),
yields the corresponding bound on $u/k$.

n/N     k/n         $\delta_k$    $\delta_{2k}$    u/k
0.1     0.0029      0.4343        0.6153           0.0978
0.2     0.0031      0.4343        0.6139           0.0989
0.3     0.003218    0.4343        0.61176          0.1007
0.4     0.003315    0.4343        0.61077          0.1015
0.5     0.003394    0.4343        0.60989          0.1023
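The threshold 0.4343 and the last column of Table I follow from elementary algebra, which the sketch below reproduces. The $\delta_{2k}$ values are copied from Table I (they come from the bounds of [15], which are not recomputed here), and the function names are hypothetical.

```python
import math


def max_delta_k_for_16():
    """Largest delta_k compatible with (16) at u = 0: positive root of 3d^2 + d - 1 = 0."""
    return (-1.0 + math.sqrt(13.0)) / 6.0


def max_unknown_ratio(delta_2k):
    """Largest u/k allowed by (17): delta_2k < 1/(2*sqrt(u/k) + 1) solved for u/k."""
    if not 0.0 < delta_2k <= 1.0:
        raise ValueError("delta_2k must lie in (0, 1]")
    return ((1.0 / delta_2k - 1.0) / 2.0) ** 2


print(round(max_delta_k_for_16(), 4))  # 0.4343

# delta_2k values taken from Table I; the results agree with its u/k column
# up to the rounding of the tabulated delta_2k (about 1e-4).
for d in (0.6153, 0.6139, 0.61176, 0.61077, 0.60989):
    print(d, round(max_unknown_ratio(d), 4))
```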

We finish this section by comparing the recovery guarantees we obtain in the zero-weight case with
conditions that guarantee recovery via $\ell_1$ minimization without using any prior support information. To
this end, we present the phase diagrams of measurement matrices $A$ with Gaussian entries that satisfy
the conditions on the restricted isometry constants $\delta_{(a+1)k}$ given in (9) and (14) with $\omega = 0$, respectively.
We use the bounds derived in [15] and plot the curves in Figure 3 for matrices satisfying the sufficient
conditions on $\delta_{4k}$ with $\rho = 1$ and $\alpha = 0.3$, $0.6$, and $0.8$.

IV. Numerical Examples 

In this section, we present numerical experiments that illustrate the benefits of using weighted $\ell_1$
minimization to recover sparse and compressible signals when partial prior support information (which
is possibly inaccurate) is available. To that end, we compare the recovery capabilities of standard $\ell_1$ and
weighted $\ell_1$ minimization for a suite of synthetically generated sparse and compressible signals. In all of
our experiments, we use SPGL1 [16], [17] to solve the standard and weighted $\ell_1$ minimization problems.
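The experiments here rely on SPGL1 for the constrained problems (3) and (10). Purely to illustrate how weights enter a solver, the sketch below uses a different and much simpler technique: iterative soft-thresholding (ISTA) applied to the Lagrangian form $\tfrac{1}{2}\|Az - y\|_2^2 + \lambda\|z\|_{1,w}$. This is not the algorithm used in the paper; the names `weighted_ista`, `lam`, and `step` are hypothetical, and `step` is assumed to be at most $1/\|A\|^2$ (squared spectral norm) for convergence.

```python
import math


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal map of t * |.|)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def weighted_ista(A, y, w, lam, step, iters=500):
    """Minimize 0.5*||Az - y||_2^2 + lam * sum_i w_i |z_i| by ISTA.

    A is a list of rows, y the measurements, w the weights in [0, 1].
    Entries with w_i = 0 are not shrunk at all.
    """
    n, N = len(A), len(A[0])
    z = [0.0] * N
    for _ in range(iters):
        # Residual r = Az - y and gradient A^T r of the quadratic term.
        r = [sum(A[i][j] * z[j] for j in range(N)) - y[i] for i in range(n)]
        grad = [sum(A[i][j] * r[i] for i in range(n)) for j in range(N)]
        # Gradient step followed by entrywise weighted shrinkage.
        z = [soft_threshold(z[j] - step * grad[j], step * lam * w[j])
             for j in range(N)]
    return z


# Toy check with A = identity, where the minimizer is known in closed form.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -1.0, 0.2]
w = [0.0, 1.0, 1.0]  # zero weight on the first entry (assumed to be on the support)
print(weighted_ista(A, y, w, lam=0.5, step=1.0))  # [3.0, -0.5, 0.0]
```

The weights only rescale the per-entry threshold, which is why entries in the support estimate (small $w_i$) are penalized less.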
Fig. 3: Comparison between the phase diagrams of measurement matrices with Gaussian entries satisfying
the sufficient recovery conditions of standard $\ell_1$ minimization and weighted $\ell_1$ minimization with $\omega = 0$
and $\alpha = 0.3$, $0.6$, and $0.8$. The plots are calculated using the upper bounds on the restricted isometry
constants derived in [15].

A. The sparse case 

We first generate signals $x$ with an ambient dimension $N = 500$ and fixed sparsity $k = 40$. We
compute the (noisy) compressed measurements of $x$ using a Gaussian random measurement matrix $A$
with dimensions $n \times N$ where we vary $n$ between 80 and 200 with an increment of 20. In the experiments
where the measurements are noisy, we set e = 1 1 || 2/ 20. 

Figure 4 shows the average reconstruction signal-to-noise ratio (SNR) over 20 experiments when using
weighted $\ell_1$ minimization depending on the number of measurements, both in the noise-free and noisy
cases. The SNR is measured in dB and is given by

$$ \mathrm{SNR}(x, x^*) = 10\log_{10}\left(\frac{\|x\|_2^2}{\|x - x^*\|_2^2}\right), \tag{18} $$

where $x$ is the true signal and $x^*$ is the recovered signal. The recovery is done via (10) using a support
estimate of size $|\widetilde{T}| = 40$ (i.e., $\rho = 1$) where

• the accuracy $\alpha$ of the support estimate ranges between zero and 1,

• the constant weight $\omega$ ranges between zero and 1 (recall that when $\omega = 1$, (10) is equivalent to
standard $\ell_1$ minimization).
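For reference, the SNR of (18) can be computed as in the following sketch (the function name `snr_db` is hypothetical, and returning infinity for an exact reconstruction is a convention chosen here).

```python
import math


def snr_db(x, x_rec):
    """Reconstruction SNR in dB: 10*log10(||x||_2^2 / ||x - x_rec||_2^2)."""
    if len(x) != len(x_rec):
        raise ValueError("signals must have the same length")
    signal = sum(v * v for v in x)
    error = sum((a - b) ** 2 for a, b in zip(x, x_rec))
    if error == 0.0:
        return math.inf  # exact recovery
    return 10.0 * math.log10(signal / error)


# A 10% relative error in the 2-norm corresponds to roughly 20 dB.
print(snr_db([1.0, 0.0], [0.9, 0.0]))
```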
Figure 4 (a) illustrates that in the noise-free case, the experimental results are consistent with the
theoretical bounds derived in Theorem 3. More specifically, it can be seen that when $\alpha > 0.5$ the best
recovery is achieved for a weight $\omega = 0$ whereas $\omega = 1$ results in the worst SNR. On the other hand,
when $\alpha < 0.5$ the performance of the recovery algorithms is shifted towards larger values of $\omega$ in the
severely underdetermined cases (small $n$). Figure 5 shows the average recovered SNR using weighted
$\ell_1$ minimization for different values of the parameter $\rho$. It is evident from the figure that using a larger
support estimate favours better reconstruction. However, it can be seen in both the noise-free and noisy
measurement vector cases that the recovery is more sensitive to the accuracy $\alpha$ of the support estimate
than its size relative to $k$.

Remark 4.1. Recall from Section III-B (see Figure 1) that when $x$ is sparse and $\alpha > 0.5$, $\omega = 0$ results
in the smallest error bound constants. Otherwise, i.e., when $\alpha < 0.5$, $\omega = 1$ minimizes the error constants.
However, this does not match entirely with our experimental observations. It can be seen from Figure 4
(b) that, in general, the best recovery is obtained for intermediate values of $\omega$.

To explain this behaviour, consider the case where the measurement matrix does not satisfy the RIP
conditions for the full recovery of a $k$-sparse $x$ via weighted $\ell_1$ minimization. In such cases, $x$ can be
regarded as compressible: fix $\hat{k} < k$ such that Theorem 3 holds for all $\hat{k}$-sparse signals and for all
$\omega \in [0, 1]$. Suppose $\hat{T}_0$ is the support of the best $\hat{k}$-term approximation of $x$. Then Theorem 3 guarantees
stable and robust recovery of $x$ where the recovery error is bounded by

$$ \|x^* - x\|_2 \le \frac{C_1'}{\sqrt{\hat{k}}}\left(\omega\|x_{\hat{T}_0^c}\|_1 + (1-\omega)\|x_{\widetilde{T}^c \cap \hat{T}_0^c}\|_1\right), $$

where $\widetilde{T}$ is the prior support estimate. Denote by $\hat{\alpha} = \frac{|\widetilde{T} \cap \hat{T}_0|}{|\widetilde{T}|}$ and note that since $\hat{T}_0 \subset T_0$, then $\hat{\alpha} \le \alpha$.
Focusing our attention on the case when $\alpha < 0.5$ (where it is observed that $0 < \omega < 1$ results in the
best recovery), we make the following observations:

(i) The constant $C_1'$ in the error bound above increases as $\omega$ goes to zero (see Figure 1).

(ii) Since $\widetilde{T}^c \cap \hat{T}_0^c \subset \hat{T}_0^c$, the term $\omega\|x_{\hat{T}_0^c}\|_1 + (1-\omega)\|x_{\widetilde{T}^c \cap \hat{T}_0^c}\|_1$ decreases as $\omega$ goes to zero.

Therefore, for a fixed $\hat{k}$, there exists $0 \le \omega \le 1$ that minimizes the product of the constant $C_1'$ and the
term $\omega\|x_{\hat{T}_0^c}\|_1 + (1-\omega)\|x_{\widetilde{T}^c \cap \hat{T}_0^c}\|_1$. Consequently, when the algorithm cannot recover the full support
of $x$, an intermediate value of $\omega$ in $[0, 1]$ may result in the smallest recovery error. A full mathematical
analysis of the above observations needs to take into account all the interdependencies between $\omega$, $\hat{k}$, $\hat{\alpha}$
as well as the parameters in Theorem 3 and is beyond the scope of this paper.
Fig. 4: Performance of weighted $\ell_1$ recovery in terms of SNR averaged over 20 experiments for sparse
signals $x$ with $k = 40$, $N = 500$, while varying the number of measurements $n$. From left to right,
$\alpha = 0.7$, $\alpha = 0.5$, and $\alpha = 0.3$. Panel (b): 5% noise variance.



B. The compressible case 

Next, we generate a signal $x$ whose coefficients decay like $j^{-p}$ where $j \in \{1, \ldots, N\}$ and $p > 1$.
In Figure 6, we illustrate the recovered signal SNR versus the size of the support estimate for $p = 1.1$.
To calculate $\alpha$ we set $k = 40$, i.e., we are interested in the best 40-term approximation. Notice that on
average, a weight $\omega \approx 0.5$ results in the best recovery. This behavior is consistent with the explanation
provided above where an intermediate value of $\omega$ balances the tradeoff between the error bound constants
and the norm of the off-support components. We repeat this experiment with $p = 1.5$, $k = 20$ and $p = 2$,
$k = 10$. The results are reported in Figures 7 and 8, and show the same qualitative behaviour.
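The setup of this experiment can be sketched as follows. The text does not say how the support estimate is drawn, so the random construction in `make_support_estimate` (pick a fraction $\alpha$ of the estimate inside $T_0$ and the rest outside, uniformly at random) is an assumption of this sketch, as are all function names and the zero-based indexing.

```python
import random


def power_law_signal(N, p):
    """Coefficients decaying like j^(-p) for j = 1, ..., N."""
    return [j ** (-p) for j in range(1, N + 1)]


def best_k_support(x, k):
    """Indices (zero-based) of the k largest-in-magnitude entries of x."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    return set(order[:k])


def make_support_estimate(T0, N, rho, alpha, seed=0):
    """Random estimate of size rho*|T0| with a fraction alpha of it inside T0."""
    T0 = set(T0)
    rng = random.Random(seed)
    size = round(rho * len(T0))
    n_correct = round(alpha * size)
    if n_correct > len(T0) or size - n_correct > N - len(T0):
        raise ValueError("rho and alpha are incompatible with |T0| and N")
    correct = rng.sample(sorted(T0), n_correct)
    wrong = rng.sample(sorted(set(range(N)) - T0), size - n_correct)
    return set(correct) | set(wrong)


x = power_law_signal(500, 1.1)
T0 = best_k_support(x, 40)  # the first 40 indices, since |x_j| is decreasing
T_est = make_support_estimate(T0, 500, rho=1.0, alpha=0.7)
print(len(T_est), len(T_est & T0))  # 40 28
```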

V. Stylized Applications 

In this section, we apply standard and weighted $\ell_1$ minimization to recover real video and audio signals
that are compressively sampled.

Fig. 5: Performance of weighted $\ell_1$ recovery in terms of SNR averaged over 20 experiments for sparse
signals $x$ with $k = 40$, $N = 500$, $n = 100$ while varying the size of the support estimate $\rho$ as a proportion
of $k$. From left to right, $\alpha = 0.7$, $\alpha = 0.5$, and $\alpha = 0.3$. Panel (b): 5% noise variance.

A. Recovery of video signals 

One natural application for weighted $\ell_1$ minimization is video compressed sensing. Traditional video
acquisition techniques capture a full frame (or image) in the pixel domain at a specific frame rate. The
number of pixels acquired per image defines the spatial sampling rate, while the number of frames
acquired per second defines the temporal sampling rate. Since the temporal sampling rate is usually high,
a group of adjacent video frames is temporally correlated, which is reflected in their spatial transform
coefficients having nonzero entries in roughly the same locations.

Our aim here is to reduce the number of samples acquired for each video frame while keeping the
same reconstruction quality by recovering using weighted $\ell_1$ minimization. Here, we assume that for
every video frame $j$, the measurements $y_j$, $j \in \{0, 1, \ldots, m-1\}$, are acquired by storing the readings of
a random subset of the CCD array with $m$ denoting the total number of frames in the video sequence.

Fig. 6: Performance of weighted $\ell_1$ recovery in terms of SNR averaged over 10 experiments for
compressible signals $x$ with $n = 100$, $N = 500$. The coefficients decay with a power $p = 1.1$. The
accuracy of the support estimate $\alpha$ is calculated with respect to the best $k = 40$ term approximation.
From left to right, $\alpha = 0.7$, $\alpha = 0.5$, and $\alpha = 0.3$.



Let $n_j$ be the number of measurements acquired per frame $j$ and $N$ be the spatial resolution (number of
pixels) to be recovered per frame. Let $D$ be the spatial sparsifying transform. The measurement matrix
$A_j$ can then be written as $A_j = R_j D$, where $R_j$ is an $n_j \times N$ restriction matrix, and $D$ is an orthonormal
basis. Note that the restriction matrix $R_j$ randomly selects $n_j$ pixels from the $N$ pixels in the CCD array
to store their readings.

For the first frame, $j = 0$, $n_0$ measurements are captured and the transform coefficients $x_0$ are recovered
by solving the standard $\ell_1$ minimization problem

$$ x_0 = \arg\min_{x} \|x\|_1 \quad \text{subject to} \quad A_0 x = y_0. $$

For every subsequent frame $j \ge 1$, a support estimate $\widetilde{T}_j$ is chosen to be the union of the locations of the
nonzero entries of $x_{j-1}$ and $x_{j-2}$ that contribute a certain percentage of the energy of $x_{j-1}$ and $x_{j-2}$,
respectively. Consequently, the coefficients $x_j$ are recovered from $n_j \le n_0$ measurements $y_j$ by solving
the following weighted $\ell_1$ minimization problem

$$ x_j = \arg\min_{x} \|x\|_{1,w} \quad \text{subject to} \quad A_j x = y_j, \quad \text{with} \quad w_i = \begin{cases} 1, & i \notin \widetilde{T}_j, \\ \omega, & i \in \widetilde{T}_j, \end{cases} $$

where $0 \le \omega \le 1$.

Fig. 7: Performance of weighted $\ell_1$ recovery in terms of SNR averaged over 10 experiments for
compressible signals $x$ with $n = 100$, $N = 500$. The coefficients decay with a power $p = 1.5$. The
accuracy of the support estimate $\alpha$ is calculated with respect to the best $k = 20$ term approximation.
From left to right, $\alpha = 0.7$, $\alpha = 0.5$, and $\alpha = 0.3$. Panel (b): 10% noise variance.

In our experiments, we use the Foreman sequence at QCIF resolution, i.e., every frame contains 
$144 \times 176$ pixels. We only consider the luma (grayscale) component of the sequence. Every frame is split 
into four blocks, each of size $N = 72 \times 88$, which are processed independently. We set $n_0 = N/2$, and 
consider both $n_j = N/2.2$ and $n_j = N/2.4$ for $j \geq 1$. The two-dimensional discrete cosine transform (DCT) is used 
as the spatial sparsifying basis, allowing for the support estimate $\widetilde{T}_j$ to include the DC component and 
the union of the AC coefficients that contribute to 97% of the energy in the AC coefficients of each of 
$x_{j-1}$ and $x_{j-2}$. The signals $x_j$ are then recovered using weighted $\ell_1$ minimization for $\omega$ equal to 0, 0.1, 
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Fig. 8: Performance of weighted $\ell_1$ recovery in terms of SNR averaged over 10 experiments for 
compressible signals $x$ with $n = 100$, $N = 500$. The coefficients decay with a power $p = 2$. The 
accuracy of the support estimate $\alpha$ is calculated with respect to the best $k = 10$ term approximation. 
From left to right, $\alpha = 0.7$, $\alpha = 0.5$, and $\alpha = 0.3$. 



0.5, and 1. 
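The support estimate described above can be sketched as follows, under assumptions: coefficients are stored as a flat list with the DC term at index 0, the energy criterion is applied greedily to the largest-magnitude AC coefficients, and the names `energy_support` and `support_estimate` are hypothetical.

```python
def energy_support(coeffs, fraction, always_keep=()):
    """Indices of the largest coefficients holding `fraction` of the energy.

    Indices in `always_keep` (e.g., the DC term) are always included and are
    left out of the energy computation.
    """
    chosen = set(always_keep)
    candidates = [i for i in range(len(coeffs)) if i not in chosen]
    total = sum(coeffs[i] ** 2 for i in candidates)
    candidates.sort(key=lambda i: abs(coeffs[i]), reverse=True)
    captured = 0.0
    for i in candidates:
        # Stop at zero entries or once enough energy has been captured.
        if coeffs[i] == 0 or captured >= fraction * total:
            break
        chosen.add(i)
        captured += coeffs[i] ** 2
    return chosen

def support_estimate(prev1, prev2, fraction=0.97, dc_index=0):
    """Union of the energy-based supports of the two previous frames."""
    return (energy_support(prev1, fraction, (dc_index,))
            | energy_support(prev2, fraction, (dc_index,)))

x_prev1 = [50.0, 9.0, 0.1, 0.0, 3.0, 0.0]   # toy DCT coefficients, DC first
x_prev2 = [48.0, 0.0, 0.0, 7.0, 0.5, 0.0]
estimate = support_estimate(x_prev1, x_prev2)
```

For the toy data, the first frame contributes indices 0, 1, and 4, the second contributes 0 and 3, and the estimate is their union.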

Figure 9 illustrates the recovery of the first 30 frames of the Foreman sequence using weighted $\ell_1$ 
minimization. The reconstruction quality is reported in terms of the peak signal to noise ratio (PSNR) 
given by the expression 

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{N \times 255^2}{\|\hat{x} - x\|_2^2} \right). \tag{19}$$

The figure demonstrates that recovery with $\omega = 0.5$ results in an improvement in PSNR averaging around 
1 dB compared to standard $\ell_1$ using the same number of measurements. A striking observation is that 
weighted $\ell_1$ minimization outperforms standard $\ell_1$ also with fewer measurements, i.e., in the case where 
$n_j = n_0$, $\forall j$ for standard $\ell_1$, whereas $n_j = n_0/2.2$ for weighted $\ell_1$. 
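A minimal sketch of the PSNR computation in (19), assuming 8-bit data (peak value 255) and signals stored as flat lists; the function name `psnr` and the toy values are illustrative only.

```python
import math

def psnr(reference, recovered, peak=255.0):
    """PSNR in dB: 10 * log10(N * peak^2 / ||recovered - reference||_2^2)."""
    if len(reference) != len(recovered):
        raise ValueError("signals must have the same length")
    squared_error = sum((r - x) ** 2 for x, r in zip(reference, recovered))
    if squared_error == 0:
        return math.inf              # perfect reconstruction
    return 10.0 * math.log10(len(reference) * peak ** 2 / squared_error)

reference = [100.0, 120.0, 140.0, 160.0]
recovered = [101.0, 119.0, 141.0, 159.0]    # every sample is off by one
value = psnr(reference, recovered)          # equals 20 * log10(255)
```

Because the sparsifying basis is orthonormal, the squared error is the same whether it is measured on pixels or on transform coefficients.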
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Fig. 9: Recovery of the first 30 frames of the Foreman sequence at QCIF resolution. The first frame 
is recovered from $n_0 = N/2$ measurements, while the remaining frames are recovered from (a) $n_j = N/2.2$ 
and (b) $n_j = N/2.4$ measurements. Recovery is performed using weighted $\ell_1$ minimization with 
$\omega \in \{0, 0.1, 0.5, 1\}$. The support estimate is derived from the union of the supports of the previous two 
frames. The black curve corresponds to the recovered PSNR using standard $\ell_1$ minimization with a fixed 
number of measurements $n_j = n_0$, $\forall j \in \{1, \ldots, 30\}$. 



B. Recovery of audio signals 

For our second stylized application, we examine the performance of weighted $\ell_1$ minimization for the 
recovery of compressed sensing measurements of speech signals. In particular, the original signals are 
sampled at 44.1 kHz, but only 1/4th of the samples are retained (with their indices chosen randomly 
from the uniform distribution). This yields the measurements $y = Rs$, where $s$ is the speech signal and 
$R$ is a restriction (of the identity) operator. Consequently, by dividing the measurements into blocks of 
size $N$, we can write $y = [y_1^T, y_2^T, \ldots]^T$. Here each $y_j = R_j s_j$ is the measurement vector corresponding 
to the $j$th block of the signal, and $R_j \in \mathbb{R}^{n_j \times N}$ is the associated restriction matrix. The signals we use 
in our experiments consist of 21 such blocks. We make the following assumptions about speech signals: 

1) The signal blocks are compressible in the DCT domain (for example, the MP3 compression standard 
uses a version of the DCT to compress audio signals). 

2) The support set corresponding to the largest coefficients in adjacent blocks does not change much 
from block to block. 

3) Speech signals have large low-frequency coefficients. 

Thus, for the reconstruction of the $j$th block, we choose the support estimate $\widetilde{T} = \widetilde{T}^1 \cup \widetilde{T}^2$. Here, $\widetilde{T}^1$ 
is the set corresponding to frequencies up to 4 kHz and $\widetilde{T}^2$ is the set corresponding to the largest $n_j/16$ 
recovered coefficients of the previous block (for the first block $\widetilde{T}^2$ is empty). The results of experiments 
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on two speech signals (one male and one female) with $N = 2048$, and $\omega \in \{0, 1/6, 2/6, \ldots, 1\}$ are 
illustrated in Figure 10. 
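The construction of the audio support estimate can be sketched as below. Several details are assumptions: the mapping of DCT index $k$ to frequency $k f_s / (2N)$ (which holds for a DCT-II of length $N$ at sampling rate $f_s$), the function names, and the `count` argument, which stands for the number of largest coefficients carried over from the previous block.

```python
def low_frequency_indices(block_length, sample_rate_hz, cutoff_hz):
    """DCT indices k whose frequency k * fs / (2N) is at most the cutoff."""
    return {k for k in range(block_length)
            if k * sample_rate_hz / (2.0 * block_length) <= cutoff_hz}

def largest_indices(coeffs, count):
    """Indices of the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]),
                   reverse=True)
    return set(order[:count])

def audio_support_estimate(block_length, prev_coeffs, count,
                           sample_rate_hz=44100.0, cutoff_hz=4000.0):
    """Union of the low-frequency set and the previous block's largest entries."""
    t1 = low_frequency_indices(block_length, sample_rate_hz, cutoff_hz)
    # For the first block there is no previous block, so the second set is empty.
    t2 = set() if prev_coeffs is None else largest_indices(prev_coeffs, count)
    return t1 | t2

N = 2048
first_block = audio_support_estimate(N, None, count=0)
prev = [0.0] * N
prev[1000] = 5.0                      # one large high-frequency coefficient
later_block = audio_support_estimate(N, prev, count=1)
```

Under the stated frequency mapping, the 4 kHz cutoff with $N = 2048$ keeps the DCT indices 0 through 371.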







Fig. 10: SNRs of two reconstructed signals (male and female voices) from compressed sensing 
measurements plotted against $\omega$. For both speech signals, an intermediate value of $\omega$ yields the best 
performance. 



VI. Proof of Theorem 3 

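Recall that $\widetilde{T}$, an arbitrary subset of $\{1, 2, \ldots, N\}$, is of size $\rho k$ where $0 \leq \rho \leq a$ and $a$ is some 
number larger than 1. Let the sets $\widetilde{T}_\alpha = T_0 \cap \widetilde{T}$ and $\widetilde{T}_\beta = T_0^c \cap \widetilde{T}$, where $|\widetilde{T}_\alpha| = \alpha|\widetilde{T}| = \alpha\rho k$ and 
$\alpha + \beta = 1$. Figure 11 illustrates these sets and shows the relationship to the weight vector $\mathrm{w}$. 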
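To make the roles of $\rho$, $\alpha$, and $\beta$ concrete, the following sketch computes them for small index sets and checks the cardinality identity $|T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha| = (1 + \rho - 2\alpha\rho)k$ that is used later in the proof. The function name `support_accuracy` and the example sets are assumptions made for illustration.

```python
def support_accuracy(true_support, estimate, k):
    """Return (rho, alpha, beta) for a support estimate of a k-sparse support."""
    t0, t_est = set(true_support), set(estimate)
    if not t_est:
        raise ValueError("alpha is undefined for an empty support estimate")
    rho = len(t_est) / k                      # relative size of the estimate
    alpha = len(t0 & t_est) / len(t_est)      # fraction of the estimate in T_0
    return rho, alpha, 1.0 - alpha

k = 4
T0 = {2, 5, 7, 11}        # support of the best k-term approximation
T_est = {2, 5, 9}         # support estimate; indices 2 and 5 are correct
rho, alpha, beta = support_accuracy(T0, T_est, k)

T_alpha = T0 & T_est
remaining = (T0 | T_est) - T_alpha            # T_0 union T~ without T~_alpha
predicted_size = (1 + rho - 2 * alpha * rho) * k
```

Here $\rho = 3/4$ and $\alpha = 2/3$, and both the explicit set and the formula give a set of size 3.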



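[Figure: the signal $x$ with its support $T_0$, the subsets $\widetilde{T} \cap T_0$ and $\widetilde{T} \cap T_0^c$ of the support estimate, and the weight vector $\mathrm{w}$, which equals $\omega$ on $\widetilde{T}$ and 1 elsewhere.]

Fig. 11: Illustration of the signal $x$ and weight vector $\mathrm{w}$ emphasizing the relationship between the sets 
$T_0$ and $\widetilde{T}$. 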



July 26, 2011 



DRAFT 






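Let $x^* = x + h$ be a minimizer of the weighted $\ell_1$ problem (10). Then 

$$\|x + h\|_{1,\mathrm{w}} \leq \|x\|_{1,\mathrm{w}}.$$

Moreover, by the choice of weights in (10), we have 

$$\omega\|x_{\widetilde{T}} + h_{\widetilde{T}}\|_1 + \|x_{\widetilde{T}^c} + h_{\widetilde{T}^c}\|_1 \leq \omega\|x_{\widetilde{T}}\|_1 + \|x_{\widetilde{T}^c}\|_1.$$

Consequently, 

$$\begin{aligned}
&\|x_{\widetilde{T}^c \cap T_0} + h_{\widetilde{T}^c \cap T_0}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c} + h_{\widetilde{T}^c \cap T_0^c}\|_1 + \omega\|x_{\widetilde{T} \cap T_0} + h_{\widetilde{T} \cap T_0}\|_1 + \omega\|x_{\widetilde{T} \cap T_0^c} + h_{\widetilde{T} \cap T_0^c}\|_1 \\
&\qquad \leq \|x_{\widetilde{T}^c \cap T_0}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \omega\|x_{\widetilde{T} \cap T_0}\|_1 + \omega\|x_{\widetilde{T} \cap T_0^c}\|_1.
\end{aligned}$$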

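Next, we use the forward and reverse triangle inequalities to get 

$$\omega\|h_{\widetilde{T} \cap T_0^c}\|_1 + \|h_{\widetilde{T}^c \cap T_0^c}\|_1 \leq \|h_{\widetilde{T}^c \cap T_0}\|_1 + \omega\|h_{\widetilde{T} \cap T_0}\|_1 + 2\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \omega\|x_{\widetilde{T} \cap T_0^c}\|_1\right).$$

Adding and subtracting $\omega\|h_{\widetilde{T}^c \cap T_0^c}\|_1$ on the left hand side, and $\omega\|h_{\widetilde{T}^c \cap T_0}\|_1$ on the right, we obtain 

$$\begin{aligned}
&\omega\|h_{\widetilde{T} \cap T_0^c}\|_1 + \omega\|h_{\widetilde{T}^c \cap T_0^c}\|_1 + \|h_{\widetilde{T}^c \cap T_0^c}\|_1 - \omega\|h_{\widetilde{T}^c \cap T_0^c}\|_1 \\
&\qquad \leq \omega\|h_{\widetilde{T} \cap T_0}\|_1 + \omega\|h_{\widetilde{T}^c \cap T_0}\|_1 + \|h_{\widetilde{T}^c \cap T_0}\|_1 - \omega\|h_{\widetilde{T}^c \cap T_0}\|_1 \\
&\qquad\quad + 2\left(\omega\|x_{\widetilde{T} \cap T_0^c}\|_1 + \omega\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c}\|_1 - \omega\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right).
\end{aligned}$$

Since $\|h_{T_0^c}\|_1 = \|h_{\widetilde{T} \cap T_0^c}\|_1 + \|h_{\widetilde{T}^c \cap T_0^c}\|_1$, this easily reduces to 

$$\omega\|h_{T_0^c}\|_1 + (1 - \omega)\|h_{\widetilde{T}^c \cap T_0^c}\|_1 \leq \omega\|h_{T_0}\|_1 + (1 - \omega)\|h_{\widetilde{T}^c \cap T_0}\|_1 + 2\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right). \tag{20}$$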
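But, we can also write 

$$\|h_{T_0^c}\|_1 = \omega\|h_{T_0^c}\|_1 + (1 - \omega)\|h_{\widetilde{T}^c \cap T_0^c}\|_1 + (1 - \omega)\|h_{\widetilde{T} \cap T_0^c}\|_1.$$

Combining the above with (20), we obtain 

$$\begin{aligned}
\|h_{T_0^c}\|_1 &\leq \omega\|h_{T_0}\|_1 + (1 - \omega)\|h_{\widetilde{T}^c \cap T_0}\|_1 + (1 - \omega)\|h_{\widetilde{T} \cap T_0^c}\|_1 + 2\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right) \\
&= \omega\|h_{T_0}\|_1 + (1 - \omega)\left(\|h_{\widetilde{T}^c \cap T_0}\|_1 + \|h_{\widetilde{T} \cap T_0^c}\|_1\right) + 2\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right).
\end{aligned}$$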
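Since the set $\widetilde{T}_\alpha = T_0 \cap \widetilde{T}$, we can write $\|h_{\widetilde{T}^c \cap T_0}\|_1 + \|h_{\widetilde{T} \cap T_0^c}\|_1 = \|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_1$ and simplify the bound 
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on $\|h_{T_0^c}\|_1$ to the following expression: 

$$\|h_{T_0^c}\|_1 \leq \omega\|h_{T_0}\|_1 + (1 - \omega)\|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_1 + 2\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right). \tag{21}$$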

Next we sort the coefficients of $h_{T_0^c}$, partitioning $T_0^c$ into disjoint sets $T_j$, $j \in \{1, 2, \ldots\}$, each of size 
$ak$, where $a > 1$. That is, $T_1$ indexes the $ak$ largest in magnitude coefficients of $h_{T_0^c}$, $T_2$ indexes the 
second $ak$ largest in magnitude coefficients of $h_{T_0^c}$, and so on. Note that this gives $h_{T_0^c} = \sum_{j \geq 1} h_{T_j}$, 
with 

$$\|h_{T_j}\|_2 \leq \sqrt{ak}\,\|h_{T_j}\|_\infty \leq (ak)^{-1/2}\|h_{T_{j-1}}\|_1. \tag{22}$$

Let $T_{01} = T_0 \cup T_1$, then using (22) and the triangle inequality we have 

$$\begin{aligned}
\|h_{T_{01}^c}\|_2 &\leq \sum_{j \geq 2} \|h_{T_j}\|_2 \leq (ak)^{-1/2} \sum_{j \geq 1} \|h_{T_j}\|_1 \\
&\leq (ak)^{-1/2}\|h_{T_0^c}\|_1.
\end{aligned} \tag{23}$$
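The sorting argument behind (22) and (23) can be checked numerically. The sketch below is only a sanity check on random data, not part of the proof; the vector length, the set `t0`, and the block size $ak = 6$ are arbitrary choices, and the helper names are hypothetical.

```python
import math
import random

def sorted_blocks(h, t0, block_size):
    """Split the complement of t0 into blocks T_1, T_2, ... by decreasing |h_i|."""
    complement = sorted((i for i in range(len(h)) if i not in t0),
                        key=lambda i: abs(h[i]), reverse=True)
    return [complement[s:s + block_size]
            for s in range(0, len(complement), block_size)]

def norm1(h, idx):
    return sum(abs(h[i]) for i in idx)

def norm2(h, idx):
    return math.sqrt(sum(h[i] ** 2 for i in idx))

rng = random.Random(1)
h = [rng.gauss(0.0, 1.0) for _ in range(40)]
t0 = {0, 1, 2, 3}                        # k = 4
blocks = sorted_blocks(h, t0, 6)         # a * k = 6, i.e., a = 1.5
scale = 6 ** -0.5                        # (ak)^(-1/2)

# (22): each block's l2 norm is controlled by the previous block's l1 norm.
holds_22 = all(norm2(h, blocks[j]) <= scale * norm1(h, blocks[j - 1]) + 1e-12
               for j in range(1, len(blocks)))

# (23): the l2 norm off T_01 is controlled by the l1 norm off T_0.
off_t01 = [i for block in blocks[1:] for i in block]
off_t0 = [i for block in blocks for i in block]
holds_23 = norm2(h, off_t01) <= scale * norm1(h, off_t0) + 1e-12
```

Both flags come out true for any input vector, since every entry of $T_j$ is no larger in magnitude than the average magnitude over $T_{j-1}$.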
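Combining the above expression with (21) we get 

$$\|h_{T_{01}^c}\|_2 \leq (ak)^{-1/2}\left(\omega\|h_{T_0}\|_1 + (1 - \omega)\|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_1 + 2\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right)\right). \tag{24}$$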

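Next, consider the feasibility of $x^*$ and $x$. Both vectors are feasible, so we have $\|Ah\|_2 \leq 2\epsilon$ and 

$$\begin{aligned}
\|Ah_{T_{01}}\|_2 &\leq 2\epsilon + \|Ah_{T_{01}^c}\|_2 \leq 2\epsilon + \sum_{j \geq 2} \|Ah_{T_j}\|_2 \\
&\leq 2\epsilon + \sqrt{1 + \delta_{ak}} \sum_{j \geq 2} \|h_{T_j}\|_2.
\end{aligned}$$

From (23) and (24) we get 

$$\begin{aligned}
\|Ah_{T_{01}}\|_2 \leq{} & 2\epsilon + 2\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right) \\
& + \omega\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\|h_{T_0}\|_1 + (1 - \omega)\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_1.
\end{aligned}$$

Noting that $|T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha| = (1 + \rho - 2\alpha\rho)k$, 

$$\begin{aligned}
\sqrt{1 - \delta_{(a+1)k}}\,\|h_{T_{01}}\|_2 \leq{} & 2\epsilon + 2\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right) \\
& + \omega\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{a}}\|h_{T_0}\|_2 + (1 - \omega)\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{a}}\sqrt{1 + \rho - 2\alpha\rho}\,\|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_2.
\end{aligned}$$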
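Since the set $T_1$ contains the largest $ak$ coefficients of $h_{T_0^c}$ with $a > 1$, and $|\widetilde{T} \setminus \widetilde{T}_\alpha| = (1 - \alpha)\rho k \leq ak$, 
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then $\|h_{T_0 \cup \widetilde{T} \setminus \widetilde{T}_\alpha}\|_2 \leq \|h_{T_{01}}\|_2$. We also have $\|h_{T_0}\|_2 \leq \|h_{T_{01}}\|_2$, thus 

$$\|h_{T_{01}}\|_2 \leq \frac{2\epsilon + 2\frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}. \tag{25}$$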

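Finally, using $\|h\|_2 \leq \|h_{T_{01}}\|_2 + \|h_{T_{01}^c}\|_2$, we combine (24) and (25) to get 

$$\|h\|_2 \leq \frac{2\left(1 + \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\right)\epsilon + 2\frac{\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\omega\|x_{T_0^c}\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}, \tag{26}$$

with the condition that the denominator is positive, equivalently 

$$\delta_{ak} + \frac{a}{\left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2}\,\delta_{(a+1)k} < \frac{a}{\left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2} - 1.$$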

□ 
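The dependence of the condition and of the constants in (26) on $\omega$, $\rho$, and $\alpha$ can be explored with a short sketch. The function names, the shorthand `gamma` for $\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}$, and the numerical values of the restricted isometry constants are all assumptions chosen for illustration.

```python
import math

def gamma(omega, rho, alpha):
    """Shorthand for omega + (1 - omega) * sqrt(1 + rho - 2 * alpha * rho)."""
    return omega + (1.0 - omega) * math.sqrt(1.0 + rho - 2.0 * alpha * rho)

def denominator(delta_ak, delta_a1k, a, omega, rho, alpha):
    """Denominator shared by (25) and (26)."""
    g = gamma(omega, rho, alpha)
    return (math.sqrt(1.0 - delta_a1k)
            - g / math.sqrt(a) * math.sqrt(1.0 + delta_ak))

def condition_holds(delta_ak, delta_a1k, a, omega, rho, alpha):
    """True when the denominator of (26) is positive."""
    return denominator(delta_ak, delta_a1k, a, omega, rho, alpha) > 0.0

def error_constants(delta_ak, delta_a1k, a, omega, rho, alpha):
    """Multipliers of epsilon and of k^(-1/2) * (weighted tail term) in (26)."""
    d = denominator(delta_ak, delta_a1k, a, omega, rho, alpha)
    if d <= 0.0:
        return None                      # the bound does not apply
    g = gamma(omega, rho, alpha)
    noise = 2.0 * (1.0 + g / math.sqrt(a)) / d
    tail = (2.0 * (math.sqrt(1.0 - delta_a1k) + math.sqrt(1.0 + delta_ak))
            / (math.sqrt(a) * d))
    return noise, tail

# Illustrative values: a = 3, rho = 1, and an 80% accurate support estimate.
standard = error_constants(0.3, 0.3, 3.0, omega=1.0, rho=1.0, alpha=0.8)
weighted = error_constants(0.3, 0.3, 3.0, omega=0.5, rho=1.0, alpha=0.8)
```

With $\omega = 1$ or $\alpha = 0.5$ the shorthand equals 1 and the condition reduces to $\delta_{ak} + a\,\delta_{(a+1)k} < a - 1$. For $\alpha > 0.5$ and $\omega < 1$ it drops below 1, which relaxes the condition and, in this example, yields smaller constants.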
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